Integrative Transcriptomic and Single-Cell Protein Characterization of Colorectal Carcinoma Delineates Distinct Tumor Immune Microenvironments Associated with Overall Survival

Colorectal carcinoma (CRC) is a heterogeneous group of tumors with varying therapeutic response and prognosis, and evidence suggests the tumor immune microenvironment (TIME) plays a pivotal role. Using advanced molecular and spatial biology technologies, we aimed to evaluate the TIME in patients with CRC to determine whether specific alterations in the immune composition correlated with prognosis. We identified primary and metastatic tumor samples from 31 consented patients, which were profiled with whole-exome sequencing and bulk RNA-seq. Immune cell deconvolution followed by gene set enrichment analysis and unsupervised clustering was performed. A subset of tumors underwent in situ analysis of the TIME spatial composition at single-cell resolution through Imaging Mass Cytometry. Gene set enrichment analysis revealed two distinct groups of advanced CRC, one with an immune activated phenotype and the other with a suppressed immune microenvironment. The activated TIME phenotype contained increased Th1 cells, activated dendritic cells, tertiary lymphoid structures, and higher counts of CD8+ T cells, whereas the inactive or suppressed TIME contained increased macrophages and a higher M2/M1 ratio. Our findings were further supported by RNA-seq data analysis from the TCGA CRC database, in which unsupervised clustering also identified two separate groups. The immunosuppressed CRC TIME had a lower overall survival probability (HR 1.66, p=0.007). This study supports the pertinent role of the CRC immune microenvironment in tumor progression and patient prognosis. We characterized the immune cell composition to better understand the complexity and vital role that immune activity states of the TIME play in determining patient outcome.


Introduction
Despite advances in screening and early therapeutic interventions, colorectal carcinoma remains a significant cause of morbidity and is one of the leading causes of cancer-related deaths in both men and women [1]. Up to one-fourth of patients present with metastatic disease at the time of diagnosis, and nearly half of patients who have had curative-intent resection will still develop recurrence [2][3][4]. Importantly, colorectal cancer (CRC) is a heterogeneous disease with a diverse molecular background that has significant implications for therapeutic approaches and overall survival. Indeed, unique tumor subsets have specific therapeutic vulnerabilities, such as RAS mutation, mismatch repair deficiency, or HER2 overexpression. Advances in the field of molecular pathology have greatly facilitated our understanding of the genomic and epigenomic landscape of CRC, with extensive work done by consortia such as The Cancer Genome Atlas (TCGA) and the international consensus molecular subtype classification consortium, which has enabled the classification of various subtypes of CRC according to their distinct molecular profiles and clinicopathologic features [5,6]. This genomic heterogeneity may partially explain the observation that the aggressiveness of metastatic CRC varies widely among patients, with some harboring localized, resectable metastases that can be managed with curative intent whereas others progress rapidly.
Beyond tumor molecular profiling alone, substantial evidence has highlighted the importance of the tumor immune microenvironment (TIME) in CRC, with some authors suggesting that immune features might serve as a better prognostic indicator than traditional histopathologic methods [7,8]. As an example, the Immunoscore, developed by Galon et al. and based on the quantification of CD3- and CD8-positive T cells using digital imaging analysis, was found to outperform the Tumor-Node-Metastasis (TNM) classification for disease-free survival and overall survival and could help predict response to therapy [9][10][11]. Other studies have found that mature T cells, dendritic cells, and memory T cells were associated with improved prognosis, whereas regulatory T cells and M2 macrophages were associated with inferior prognosis [12,13]. Since these initial pivotal studies, our ability to assess the TIME has significantly improved, which has led to a better understanding of its complexity. Recent advances in whole slide scanning, multiplexed single-cell in situ protein profiling and machine learning algorithms have enabled more objective quantification. Additionally, single-cell technologies including RNA sequencing have led to further insights into the TIME [14,15].
The purpose of this study was to evaluate the TIME in patients with CRC to determine whether specific alterations in the immune composition correlated with histopathologic features, molecular alterations, and overall prognosis. We found that the CRC TIME could be characterized as either immune activated (higher proportion of activated dendritic cells, tertiary lymphoid structures, higher CD8+ T cells) or inactivated (inactivated dendritic cells, lower CD8+ T cells, increased M2 macrophages), and that the activated immune microenvironment is associated with improved survival.

Case selection and histopathology review
We included all patients with colorectal carcinoma who had been consented for molecular testing through the Englander Institute for Precision Medicine at Weill Cornell Medicine [16]. Each sample, including primary and metastatic sites, was reviewed by a certified pathologist to confirm the histologic diagnosis. Only fresh frozen samples with sufficient RNA content and quality were included in the study cohort. Clinicopathologic data as well as therapeutic interventions and follow-up information were obtained through review of available surgical pathology material and the electronic medical record. This study was approved by the institutional review board (IRB) at Weill Cornell Medicine (IRB protocol #1305013903).

Immune cell deconvolution and TCGA cohort validation
We utilized xCell, a gene signature-based algorithm, for the deconvolution of bulk RNA-sequencing data to interrogate the tumor immune landscape [19]. We evaluated transcriptome expression data for 64 immune and stromal cell types, and the results were associated with WES results. Consensus clustering was performed using the ConsensusClusterPlus R package. To validate our findings, the TCGA colon cancer cohort [6] (n = 477) was also interrogated, and the results were correlated with Consensus Molecular Subtypes of CRC (https://portal.gdc.cancer.gov).
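The general idea behind consensus clustering can be illustrated with a minimal Python sketch. This is not the ConsensusClusterPlus implementation and not the analysis code used in this study: the function names, the use of k-means as the base clustering method, the 80% resampling fraction, and the toy enrichment scores are all assumptions made for illustration. The sketch repeatedly clusters random subsets of samples and records how often each pair of samples falls in the same cluster.

```python
import random


def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members; keep it if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k=2, n_resamples=100, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of samples co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times a pair shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times a pair was drawn together
    size = max(k, int(frac * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical per-sample scores: (activated DC score, inactive DC score).
toy_scores = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15),
              (0.10, 0.90), (0.20, 0.80), (0.15, 0.85)]
consensus = consensus_matrix(toy_scores, k=2)
```

In this toy example, pairs of samples with similar scores have consensus values near 1 and pairs from different groups have values near 0, which is the block structure that a consensus heatmap is meant to reveal.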

Immunohistochemistry (IHC)
Cases were re-reviewed by a certified pathologist with subspecialty training in gastrointestinal pathology. Histologic features including tumor architecture, tumor-stroma interface, immune infiltration, and stromal response were recorded for each case. On a subset of tumors, IHC was performed on sections of formalin-fixed paraffin-embedded (FFPE) tumor tissue using a Bond III automated immunostainer and the Bond Polymer Refine detection system (Leica Microsystems, IL, USA). The following antibodies and conditions (dilution, antigen retrieval solution and pH, antigen retrieval time) were used: CD86 (Cell Signaling Technology E2G8P Rabbit mAb, 1:100, Sodium Citrate buffer, pH6, 30 min) and CD80 (MyBioSource MBS4380739 Mouse mAb, 1:100, Sodium Citrate buffer, pH6, 30 min).

Imaging Mass Cytometry
Per the manufacturer's protocol, antibodies were conjugated in BSA- and azide-free format using the MaxPar X8 multimetal labeling kit (Standard BioTools). The antibodies were tested on control tissues (e.g., lymph node, tonsil) to validate the staining pattern, as verified by study pathologists. Fifteen tumors (2 primary and 13 metastases) from 13 patients underwent in situ immune evaluation by imaging mass cytometry (IMC). The tissue samples were stained according to the manufacturer's protocol following traditional immunohistochemical methods. Briefly, after the slides were warmed for 2 hours, they were deparaffinized in Citrisolv twice for 15 min each, followed by ethanol rehydration. Antigen retrieval was performed in a water bath for 30 minutes at pH 9.0. The slides were then washed and blocked using Thermo Fisher SuperBlock solution for an hour. The slides were incubated overnight in antibody cocktail and washed the next day in TBS buffer and water. Lastly, DNA intercalator staining was performed by incubating the slides in 1:400 Ir solution for 30 min at RT. The slides were washed and air-dried, ready to be scanned. Two regions of interest (ROI) were chosen per tumor, including the tumor/normal interface when present.

Imaging Mass Cytometry Data Analysis
The IMC™ data were processed using the imc (v.0.1.4) package, available at [https://github.com/ElementoLab/imc], as we previously reported [20,21]. The 29-channel image was reduced into a probability map of nuclei, cytoplasm, and background. Cells were segmented by applying DeepCell (v.0.2.1) to the probability map. To identify the expression profiles in the multiplexed images, each channel of every cell was mean-aggregated. The Scanpy library was utilized for downstream preprocessing. A log transformation and z-score normalization truncated at 3 standard deviations were performed to bring all channels to the same scale. This was followed by batch correction using Harmony (v.0.0.9) [10.5281/zenodo.7351719] to mitigate sample-specific batch effects. To identify cell phenotypes, Leiden clustering was applied with a resolution of 1.0 [22]. Each resulting cluster was manually labeled using the matrixplot provided. For statistical analysis, the cellular density per ROI was calculated and a two-sided Mann-Whitney U-test was conducted, followed by a Benjamini-Hochberg multiple hypothesis correction to account for the repeated comparisons across cell types.
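Two of the steps above, the per-channel normalization and the Benjamini-Hochberg correction, can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the pipeline code: the function names are hypothetical, the use of log(1 + x) as the log transformation and of the population standard deviation are assumptions, and the Mann-Whitney U-test itself is omitted (a statistics library such as SciPy provides it as `scipy.stats.mannwhitneyu`).

```python
import math
import statistics


def normalize_channel(values, clip=3.0):
    """log(1 + x) transform, z-score, then truncate to [-clip, +clip]."""
    logged = [math.log1p(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)  # assumption: population SD
    if sd == 0:
        return [0.0 for _ in logged]
    return [max(-clip, min(clip, (x - mean) / sd)) for x in logged]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical mean intensities of one marker: 20 dim cells, 1 bright outlier.
toy_channel = [1.0] * 20 + [1000.0]
toy_z = normalize_channel(toy_channel)

# Hypothetical raw p-values, one per cell type compared between groups.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
```

The truncation keeps a single very bright cell from dominating the scale of a channel, and the adjusted p-values are never smaller than the raw ones, which is what guards against false positives when many cell types are tested at once.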
At the time of tissue acquisition, all but 3 patients had Stage IV disease, and most (77%) patients had received first-line chemotherapy prior to tissue collection. Nine samples were from the primary tumor and 32 samples were taken from metastatic sites. The molecular characteristics of this cohort were fairly representative of conventional colorectal carcinoma, with frequent alterations in KRAS (74%), TP53 (71%), and APC (51%). The majority of tumors (95%) were microsatellite stable. Consensus molecular subtypes 2 and 4 were most common (27% and 24%, respectively). The follow-up interval from date of diagnosis to last contact was 56 months (range 4-168 months). Twenty-two (70%) patients were alive, and the mean survival for the cohort was 40 months (range 21-70 months).

Colorectal cancer is characterized by two clusters with distinct immune environments
Gene set enrichment analysis followed by unsupervised clustering revealed two distinct clusters, which varied in their immune composition (Fig. 1). One cluster, comprising 21 samples (18 patients), contained relatively high numbers of activated dendritic cells and Th1 cells (activated immune microenvironment), whereas a second cluster, which included 20 samples (14 patients), contained increased inactive dendritic cells and basophils (inactivated immune microenvironment). Correlating with the above findings, dendritic costimulatory molecules and Th1 cell markers showed higher expression in the activated group (Supplementary Fig. 1).
The clinicopathologic features of the two subgroups are summarized in Supplementary Table 2. Patient demographics, tissue site (primary versus metastasis as well as liver versus non-liver metastasis), tumor grade and stage at diagnosis did not significantly differ between the groups, though right-sided tumors were more common in the suppressed immune group (p = 0.019). In addition, molecular analysis revealed more frequent KRAS mutations in the suppressed subgroup (75% versus 38%, p = 0.03). In keeping with this, the CMS3 subtype was more common in the suppressed tumors while the CMS2 subtype predominated in the activated group.
We then sought to confirm and further characterize these two immune subtypes by detailed histopathologic review and by Imaging Mass Cytometry. Fifteen samples (8 activated and 7 suppressed) underwent detailed histopathologic review. There was no difference in tumor architecture or histologic subtype, tumor budding, or necrosis. The presence of neutrophils, eosinophils, and plasma cells did not significantly differ between the two groups. However, tertiary lymphoid structures (TLS), by microscopic identification, were present in 5 tumors analyzed, 4 of which were categorized as activated. While these were numerous in two cases (both activated), most contained occasional TLS within the tumor periphery (Fig. 2a). In addition, a dense stromal response was present in 5 (71%) of the suppressed tumors whereas this response was only seen in 1 (14%) tumor within the activated group (p = 0.029) (Fig. 2b). Immunohistochemical staining for CD80 and CD86 (activated dendritic cell markers) was performed in representative cases, which showed strong positivity at the tumor-stroma interface in activated tumors but not in suppressed tumors (Fig. 2c,d), supporting the RNA-Seq findings.
We focused the multiplexed IMC™ analysis on the immune composition of each tumor. To quantify the in situ expression of 26 protein markers, IMC™ was performed on a total of 83,898 cells from 15 tumors (11.14 mm², 35 regions of interest). The antibodies included in the panel are listed in Supplementary Table 1 and Fig. 3a. Qualitatively, IMC™ demonstrated a more robust T cell infiltrate throughout the peritumoral stroma within the activated group, with increased TLS, as was demonstrated on the hematoxylin & eosin-stained tissue slide. In suppressed tumors, the dense stromal response was highlighted, as well as an increase in macrophages (Fig. 2e,f). Quantitatively, CD8 T cells and macrophages were the most abundant inflammatory cells in the cohort (Fig. 3a, right panel). When comparing activated versus suppressed tumors, we found significant differences in the cell composition of the tumor microenvironment, with CD8 T cells being more frequent in the activated subgroup whereas macrophages were more often seen in the suppressed group, with an increased M2/M1 ratio in the suppressed subgroup and increased expression of CD163 and CD206 (Fig. 3b,c). Dendritic cells were difficult to isolate through this methodology and the activity state could not be reliably determined, as signals for CD86 and CD80 by this methodology were weak and relatively nonspecific.
We next examined the immune microenvironment as it relates to tumor location or metastases site.
Although the IMC analysis identified no significant difference in cell type proportion across primary or metastatic sites (Supplemental Table 2), we did observe that the immune activated versus immunosuppressed clusters readily separate in both the primary tumor and in colorectal liver metastases clusters, whereas this distinction was less apparent with extrahepatic CRC metastases (Fig. 4). In patients with CRC liver metastases, we observe an inflammatory TIME enriched in CD8 and CD4 T cells and M1 macrophages, along with a pro-tumor immune infiltrate of M2 macrophages, MDSCs and Tregs. The immune suppressed CRC liver metastasis cluster is also enriched in TGF-β signaling and the immune checkpoint blockade signature 2 (Fig. 4), consistent with the observation that active liver metastases induce systemic immunosuppression and relative resistance to immunotherapy. Four patients had samples from multiple sites that underwent RNA-Seq analysis. Three patients had 2 samples each, whereas metastases from 5 distinct sites as well as the primary tumor were evaluated in one patient (Supplementary Table 3). In two patients, the metastatic tumors clustered similarly. However, in one patient the initial primary carcinoma was found to have a suppressed immune environment while the subsequent lung metastasis demonstrated an activated immune environment. Additionally, for the patient with 6 distinct tumor samples, all but one clustered as suppressed while the adrenal metastasis was classified as activated.

Distinct tumor immune microenvironments of colorectal carcinoma are associated with overall survival
To examine the activated and immune suppressed TIME CRC signature in a larger cohort, a similar gene set enrichment analysis with unsupervised clustering was performed on the publicly available RNA-Seq data from the TCGA primary colon carcinoma cohort, comprising 477 patients with newly diagnosed colorectal carcinoma. In contrast to our study cohort, these patients presented mostly with earlier stage disease and had not received therapy prior to sample acquisition. However, similar to the findings in our cohort, unsupervised clustering did separate the TCGA tumors into two separate groups, which varied predominantly based on their inactivated to activated dendritic cell ratio (Supplementary Fig. 2). There was no significant difference in stage of disease between the two groups, though we did again identify a significant association between KRAS mutational status and increased inactivated dendritic cells (p = 0.006). Finally, we looked at the survival probability in relation to the dendritic cell ratio. Those with increased inactive dendritic cells (i.e., the immune suppressed TME) had a lower overall survival probability compared with those who had more activated dendritic cells (i.e., activated TME) (Fig. 5, HR 1.66, p = 0.007).
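As an illustration of how a survival probability curve of this kind can be estimated from follow-up data, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. The text above does not state which estimator was used, so this choice is an assumption, and the function name and the toy follow-up times are hypothetical. The hazard ratio reported above would require a regression model (for example, Cox proportional hazards), which is not shown here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient (e.g., months)
    events : 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for five patients (months, death observed?).
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Computing one such curve per cluster and comparing the two is the usual way a figure like the one cited above is built; censored patients contribute to the risk set up to their last contact but never trigger a drop in the curve.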

Discussion
Herein, we identify two distinct clusters of advanced carcinomas with unique immune microenvironment signatures. Our findings suggest that this distinction is based predominantly on activation states of the immune cell composition, and largely independent of the CMS subtype or of driving oncogenic mutations. This study adds to the expanding literature which focuses on the importance of the tumor microenvironment in disease heterogeneity. Using currently available advanced technology, we demonstrate, similar to the findings by Galon et al. in this cancer subtype, that an activated immune environment (including CD8 T cells but also dendritic cells, tertiary lymphoid structures, among others) may be associated with improved patient prognosis. Further studies utilizing these techniques to better characterize the activation state of various immune cell subpopulations will help to uncover whether tumors with activated immune environments may be more sensitive to checkpoint inhibition, offering additional therapeutic implications.
The tumor microenvironment comprises a complex network of cell types such as lymphocytes, dendritic cells, myeloid cells, fibroblasts, endothelial cells and extracellular components (cytokines, chemokines, etc.), all of which heavily influence the ability of a tumor to grow and disseminate [23][24][25]. The characteristics of this environment are thought to have significant prognostic and even therapeutic implications in numerous cancer subtypes including ovarian, colorectal, lung and breast, and several studies have shown that the inflammatory network in particular plays a pivotal role in the evolution of cancer [26-28]. However, it is not just the density and cell type but also a cell's specific phenotype which remains imperative in directing the immune response to be either immunostimulatory or immunotolerant, a feature that is becoming more evident as our ability to characterize these components advances.
Although we demonstrate that an activated CRC TIME is associated with improved survival, it is not clear at this time if this improved survival is due to improved benefit with cytotoxic therapy, or rather a feature of a less aggressive CRC phenotype.
Cancer cells have evolved multiple mechanisms to escape immune surveillance, such as defects in antigen presentation machinery, upregulation of negative regulatory pathways, and the recruitment of immunosuppressive cell populations, which ultimately result in an inefficient cytotoxic response. Our ability to characterize and define the TIME has significantly evolved. Earlier studies relying on manual inspection of glass slides and single surface marker analysis by immunohistochemistry were limited in their ability to assess the dynamic nature of the TIME. This lack of specificity in defining subpopulations of cells may account for some of the inconsistencies in the literature between cell type and prognosis. Due to the complexity of the immune microenvironment that we now recognize, computational algorithms using bulk transcriptome data have become a popular approach to better understanding this interaction between tumor and host. The CMS classification scheme, for example, was one of the first subclassification studies using RNA-Seq data in colorectal carcinomas to incorporate features of the immune microenvironment in addition to molecular alterations within the tumor, in order to derive subclassifications [6]. This is pertinent in colorectal cancer, as multiple studies have demonstrated a correlation between the presence of mature T cells and dendritic cells and a more favorable prognosis, the former being the basis for the Immunoscore [9,12,13,29,30]. Similarly, the presence of B-cell and T-cell aggregates, also known as tertiary lymphoid structures (TLS) in the TIME, is considered a favorable prognostic feature as they promote T cell activation and antitumor effects [8,31]. Dendritic cells also play a prominent role in tumor response, as they vary significantly in their immunostimulatory or immunosuppressive activity at different stages of cancer progression and are highly dependent on local signaling [32,33]. In general, immature dendritic cells are peripherally located and non-mobile but are highly phagocytic and can be stimulated by various factors (including tumor antigens) to undergo maturation and activation, which then elicits an immune response by presenting antigens to T cells, inducing a cytotoxic T lymphocyte immune response [34]. They also secrete chemokines that cause NK cell and T cell migration into the tumor and cytokines that maintain cytotoxic functions [35,36]. Not surprisingly, studies in a variety of solid tumors have found that increased dendritic cells, particularly those in the mature state with higher expression of costimulatory molecules including CD40, CD83, and CD86, are associated with improved prognosis [37][38][39][40]. On the other hand, tumor-associated dendritic cells have been found to often exhibit impaired function, including decreased uptake and presentation of antigens, reduced expression of costimulatory surface molecules, and inefficient migration, all of which may result in reduced immune response and tumor evasion [41]. Studies have shown that tumor cells can release cytokines such as IL-6, IL-10 and TGF-β that cause dendritic cells to remain in an immature or immunosuppressive stage [33,42-44]. There are also metabolic inhibitors including IDO and arginase, which are produced by tumor-associated macrophages (particularly the M2 phenotype), and upregulation of receptors such as CTLA4 and PD-1, all of which may be contributing to tumor immune escape [45,46].
The findings of the current study are unique in that we were able to evaluate a specific cohort of patients with advanced colorectal carcinoma, mostly Stage IV, who had already failed first-line therapy. Within this cohort, we first identified two distinct groups through unsupervised clustering that differed predominantly in the activation state of the dendritic cells and other immune cells. Our findings are in keeping with other studies which have found an association between increased Th1 and activated dendritic cells and improved prognosis, whereas increased inactivated dendritic cells are generally associated with progression of disease [7,11,47]. Prior research has also characterized KRAS mutant CRC as having limited cytotoxic T cell infiltration, reduced T helper 1 responses, and reduced IFN-gamma signaling, generating an overall immunosuppressive tumor microenvironment phenotype, which correlates with our finding of increased KRAS mutations in the subgroup with a suppressed TIME [48,49].
Another unique finding of the current study is that the reactivity state of the TIME did not depend on site of disease, either primary versus metastasis or location of metastasis. This is an important point, as prior studies have suggested that liver metastases are associated with immunosuppression, and patients with liver metastases have been shown to have minimal response to systemic immunotherapy [50]. However, we identified metastatic liver tumors with both activated and suppressed TIME, suggesting that a subset of patients with hepatic involvement may benefit from immunotherapy.
There are several recognized limitations of the current study. The cohort of patients with advanced colorectal carcinoma for whom sufficient material was available for RNA-Seq was relatively small. In addition, while review of the histomorphology and IMC™ analysis were performed in order to confirm our findings in situ, this comparison was somewhat limited by the fact that the FFPE tissue areas used for histopathology, IHC and IMC™ analysis may not exactly correspond to matched frozen tissue areas used for bulk RNA-Seq. Second, the IMC™ analysis was limited to discrete regions of interest that may not be representative of the entire tissue sample. Lastly, we were unable to fully characterize the immune population through IMC™ given the lack of additional immune markers in the available panel. Additional studies that can isolate dendritic cells and their activation states are warranted.
In conclusion, this study supports the pertinent role of the immune microenvironment in tumor progression and patient prognosis, even within a group of patients with advanced stage disease who have failed multiple lines of therapy. We characterized the immune cell composition utilizing novel, more advanced technologies such as RNA-Seq and IMC™ in order to better understand the complexity and vital role that activity states play in determining response to tumor infiltration, thereby affecting response to therapy and overall patient outcome.

Immunohistochemical staining for CD86 demonstrates positivity in the immune cells for an immune activated tumor (c), whereas a tumor with an inactivated immune microenvironment is negative (d).

a. Quantitative analysis of Imaging Mass Cytometry demonstrates that CD8 T cells and macrophages were the most abundant inflammatory cells in the colorectal carcinoma cohort. No significant difference in cell type proportion was noted across primary or metastatic sites. b. When comparing activated versus inactivated tumors by transcriptomic analysis, we found significant differences in the cell composition of the TIME between the two groups, with CD8 T cells being more frequent in the activated subgroup whereas macrophages (predominantly M2 subtype, c) were more often seen in the inactivated group. TIME = Tumor immune microenvironment.

Gene expression heatmap of the tumor immune microenvironment (TIME) of primary and metastatic colorectal carcinoma (CRC) aligned by CMS molecular classification and anatomic site. The expression value is log-transformed and median centered for selected genes. Assigned activated and suppressed clusters are represented on top.

Supervised consensus clustering of our CRC tumors according to a 170 immune gene signature classifies tumors into cluster C1 that contains a higher number of activated dendritic cells and Th1 cells (activated TIME), and cluster C2 that contains increased inactive dendritic cells and basophils (inactivated TIME).

Figures

Figure 1

Figure 2
Table 1
Clinicopathologic features of the study cohort.
Declarations
Conflicts of Interest: O.E. holds equity in OneThree Biotech, Volastra Therapeutics, Owkin, Champions Oncology, Pionyr Immunotherapeutics, Harmonic Discovery and Freenome. The remaining authors declare no competing interests. No patents have been filed or are related to this manuscript.